    Taming Horizontal Instability in Merge Trees: On the Computation of a Comprehensive Deformation-based Edit Distance

    Comparative analysis of scalar fields in scientific visualization often involves distance functions on topological abstractions. This paper focuses on the merge tree abstraction (representing the nesting of sub- or superlevel sets) and proposes the application of the unconstrained deformation-based edit distance. Previous approaches on merge trees often suffer from instability: small perturbations in the data can lead to large distances between the abstractions. While some existing methods can handle so-called vertical instability, the unconstrained deformation-based edit distance addresses both vertical and horizontal instabilities, the latter also called saddle swaps. We establish the computational complexity as NP-complete, and provide an integer linear program formulation for computation. Experimental results on the TOSCA shape matching ensemble provide evidence for the stability of the proposed distance. We thereby showcase the potential of handling saddle swaps for comparison of scalar fields through merge trees.

    Comparative Design-Choice Analysis of Color Refinement Algorithms Beyond the Worst Case

    Color refinement is a crucial subroutine in symmetry detection in theory as well as practice. It has further applications in machine learning and in computational problems from linear algebra. While tight lower bounds for the worst-case complexity are known [Berkholz, Bonsma, Grohe, ESA2013], no comparative analysis of design choices for color refinement algorithms is available. We devise two models within which we can compare color refinement algorithms using formal methods: an online model and an approximation model. We use these to show that no online algorithm is competitive beyond a logarithmic factor and no algorithm can approximate the optimal color refinement splitting scheme beyond a logarithmic factor. We also directly compare strategies used in practice, showing that, on some graphs, queue-based strategies outperform stack-based ones by a logarithmic factor and vice versa. Similar results hold for strategies based on priority queues.
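For readers unfamiliar with the subroutine, the following is a minimal sketch of naive color refinement, assuming an undirected graph given as an adjacency dictionary. It recolors all vertices in rounds and does not use the queue, stack, or priority-queue worklists that the abstract compares; the function name `color_refinement` and the input layout are illustrative choices, not taken from the paper.

```python
from collections import Counter


def color_refinement(adj):
    """Naive color refinement (illustrative sketch, not the paper's algorithm).

    adj: dict mapping each vertex to an iterable of its neighbours.
    Returns a dict vertex -> integer color describing the stable coloring.
    """
    colors = {v: 0 for v in adj}  # start from a uniform coloring
    while True:
        # Signature of a vertex: its own color plus the multiset of
        # neighbour colors, stored as sorted (color, count) pairs.
        signatures = {
            v: (colors[v],
                tuple(sorted(Counter(colors[u] for u in adj[v]).items())))
            for v in adj
        }
        # Rename signatures to small integers in a deterministic order.
        palette = {sig: i
                   for i, sig in enumerate(sorted(set(signatures.values())))}
        new_colors = {v: palette[signatures[v]] for v in adj}
        # Each round can only split classes, so an unchanged number of
        # classes means the partition is stable.
        if len(set(new_colors.values())) == len(set(colors.values())):
            return new_colors
        colors = new_colors
```

On a path with four vertices, for example, the two endpoints end up in one color class and the two inner vertices in another. Each round of this version touches every vertex, which is what worklist-based strategies are designed to avoid.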

    MixedTrails: Bayesian hypothesis comparison on heterogeneous sequential data

    Sequential traces of user data are frequently observed online and offline, e.g., as sequences of visited websites or as sequences of locations captured by GPS. However, understanding factors explaining the production of sequence data is a challenging task, especially since the data generation is often not homogeneous. For example, navigation behavior might change in different phases of browsing a website, or movement behavior may vary between groups of users. In this work, we tackle this task and propose MixedTrails, a Bayesian approach for comparing the plausibility of hypotheses regarding the generative processes of heterogeneous sequence data. Each hypothesis is derived from existing literature, theory or intuition and represents a belief about transition probabilities between a set of states that can vary between groups of observed transitions. For example, when trying to understand human movement in a city and given some observed data, a hypothesis assuming tourists to be more likely than locals to move towards points of interest can be shown to be more plausible than a hypothesis assuming the opposite. Our approach incorporates such hypotheses as Bayesian priors in a generative mixed transition Markov chain model, and compares their plausibility utilizing Bayes factors. We discuss analytical and approximate inference methods for calculating the marginal likelihoods for Bayes factors, give guidance on interpreting the results, and illustrate our approach with several experiments on synthetic and empirical data from Wikipedia and Flickr. Thus, this work enables a novel kind of analysis for studying sequential data in many application areas. Comment: Published in Data Mining and Knowledge Discovery (2017) and presented at ECML PKDD 201
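To illustrate how hypotheses expressed as priors can be compared with Bayes factors, here is a minimal sketch for the simpler homogeneous case: a single first-order Markov chain with an independent Dirichlet prior on each row of the transition matrix. It is not the MixedTrails model, which additionally handles groups of transitions; the function names, the count matrix, and the prior pseudo-counts are all hypothetical.

```python
from math import lgamma


def log_marginal_likelihood(counts, alpha):
    """Log evidence of transition counts under row-wise Dirichlet priors.

    counts[i][j]: observed transitions from state i to state j.
    alpha[i][j]:  Dirichlet pseudo-counts encoding a hypothesis (all > 0).
    """
    total = 0.0
    for n_row, a_row in zip(counts, alpha):
        # Normalising terms of the Dirichlet-multinomial for this row.
        total += lgamma(sum(a_row)) - lgamma(sum(a_row) + sum(n_row))
        # Per-transition terms.
        total += sum(lgamma(a + n) - lgamma(a)
                     for a, n in zip(a_row, n_row))
    return total


def log_bayes_factor(counts, alpha_a, alpha_b):
    """Positive values favour hypothesis A over hypothesis B."""
    return (log_marginal_likelihood(counts, alpha_a)
            - log_marginal_likelihood(counts, alpha_b))


# Hypothetical data: two states that almost always alternate.
counts = [[0, 9], [9, 0]]
switching = [[1, 5], [5, 1]]  # hypothesis A: the chain tends to switch state
staying = [[5, 1], [1, 5]]    # hypothesis B: the chain tends to stay put
print(log_bayes_factor(counts, switching, staying))
```

Working in log space with `lgamma` avoids overflow of the Gamma function for realistic count sizes. With these made-up counts the log Bayes factor is positive, so the switching hypothesis is the more plausible of the two.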

    Whole-genome sequencing identifies rare genotypes in COMP and CHADL associated with high risk of hip osteoarthritis.

    We performed a genome-wide association study of total hip replacements, based on variants identified through whole-genome sequencing, which included 4,657 Icelandic patients and 207,514 population controls. We discovered two rare signals that strongly associate with osteoarthritis total hip replacement: a missense variant, c.1141G>C (p.Asp369His), in the COMP gene (allelic frequency = 0.026%, P = 4.0 × 10^-12, odds ratio (OR) = 16.7) and a frameshift mutation, rs532464664 (p.Val330Glyfs*106), in the CHADL gene that associates through a recessive mode of inheritance (homozygote frequency = 0.15%, P = 4.5 × 10^-18, OR = 7.71). On average, c.1141G>C heterozygotes and individuals homozygous for rs532464664 had their hip replacement operation 13.5 years and 4.9 years earlier than others (P = 0.0020 and P = 0.0026), respectively. We show that the full-length CHADL transcript is expressed in cartilage. Furthermore, the premature stop codon introduced by the CHADL frameshift mutation results in nonsense-mediated decay of the mutant transcripts.